Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Abstract

Recognizing arbitrary multi-character text in unconstrained natural photographs is a hard problem. In this paper, we address an equally hard sub-problem in this domain viz. recognizing arbitrary multi-digit numbers from Street View imagery. Traditional approaches to solve this problem typically separate out the localization, segmentation, and recognition steps. In this paper we propose a unified approach that integrates these three steps via the use of a deep convolutional neural network that operates directly on the image pixels. We employ the DistBelief implementation of deep neural networks in order to train large, distributed neural networks on high quality images. We find that the performance of this approach increases with the depth of the convolutional network, with the best performance occurring in the deepest architecture we trained, with eleven hidden layers. We evaluate this approach on the publicly available SVHN dataset and achieve over $96\%$ accuracy in recognizing complete street numbers. We show that on a per-digit recognition task, we improve upon the state-of-the-art, achieving $97.84\%$ accuracy. We also evaluate this approach on an even more challenging dataset generated from Street View imagery containing several tens of millions of street number annotations and achieve over $90\%$ accuracy. To further explore the applicability of the proposed system to broader text recognition tasks, we apply it to synthetic distorted text from reCAPTCHA. reCAPTCHA is one of the most secure reverse Turing tests that uses distorted text to distinguish humans from bots. We report a $99.8\%$ accuracy on the hardest category of reCAPTCHA. Our evaluations on both tasks indicate that at specific operating thresholds, the performance of the proposed system is comparable to, and in some cases exceeds, that of human operators.
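
The abstract reports accuracy on complete street numbers, accuracy per digit, and performance "at specific operating thresholds". The sketch below illustrates how these three kinds of figures can differ on the same predictions. It is not the authors' evaluation code: the function names, the position-by-position definition of per-digit accuracy, the confidence scores, and the toy data are all assumptions made for illustration.

```python
from typing import List, Tuple

# (predicted number, true number, model confidence).
# Hypothetical toy values, NOT results from the paper.
Prediction = Tuple[str, str, float]

SAMPLE: List[Prediction] = [
    ("123", "123", 0.99),
    ("45", "46", 0.60),    # one wrong digit makes the whole number wrong
    ("7", "7", 0.95),
    ("2014", "2014", 0.90),
]


def sequence_accuracy(preds: List[Prediction]) -> float:
    """Fraction of images whose entire predicted number matches the label."""
    return sum(pred == true for pred, true, _ in preds) / len(preds)


def per_digit_accuracy(preds: List[Prediction]) -> float:
    """Fraction of label digits predicted correctly, compared position by
    position (an assumed definition, chosen for simplicity)."""
    correct = 0
    total = 0
    for pred, true, _ in preds:
        total += len(true)
        correct += sum(a == b for a, b in zip(pred, true))
    return correct / total


def accuracy_and_coverage(
    preds: List[Prediction], threshold: float
) -> Tuple[float, float]:
    """Keep only predictions with confidence >= threshold.

    Returns (sequence accuracy on the kept predictions, fraction kept).
    Raising the threshold trades coverage for accuracy.
    """
    kept = [p for p in preds if p[2] >= threshold]
    coverage = len(kept) / len(preds)
    accuracy = sequence_accuracy(kept) if kept else 0.0
    return accuracy, coverage


if __name__ == "__main__":
    print("sequence accuracy:", sequence_accuracy(SAMPLE))
    print("per-digit accuracy:", per_digit_accuracy(SAMPLE))
    print("accuracy, coverage at 0.8:", accuracy_and_coverage(SAMPLE, 0.8))
```

On this toy data, whole-number accuracy is lower than per-digit accuracy because a single wrong digit invalidates the entire number, and discarding the low-confidence prediction raises accuracy at the cost of coverage.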

Bibliographic details

Authors: Goodfellow, Ian J.; Bulatov, Yaroslav; Ibarz, Julian; Arnoud, Sacha; Shet, Vinay
Year: 2014
Original format: PDF
Language: English

Similar literature

1. Deep convolutional neural network training enrichment using multi-view object-based analysis of Unmanned Aerial systems imagery for wetlands classification [J]. Liu Tao, Abd-Elrahman Amr. ISPRS Journal of Photogrammetry and Remote Sensing. 2018, May issue
2. Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks [J]. Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour, Sensors. 2019, no. 8
3. A robust similarity based deep Siamese convolutional neural network for gait recognition across views [J]. George Merlin Linda, Govindarajan Themozhi, Rajasekaran Kavitha Angamuthu, Computational Intelligence. 2020, no. 3
4. Urban Street Contexts Classification Using Convolutional Neural Networks and Streets Imagery [C]. Fahad Alhasoun, Marta González. IEEE International Conference on Machine Learning and Applications. 2019
5. Hyperparameter Optimization of Deep Convolutional Neural Networks Architectures for Object Recognition [D]. Albelwi, Saleh. 2018
6. Motor Imagery EEG Signal Recognition Using Deep Convolution Neural Network [O]. Xiongliang Xiao, Yuee Fang. 2021
7. When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition [O]. Hu, G, Yang, Y, Yi, D, 2015
